33 research outputs found

    Association of oral bacteria with oral hygiene habits and self-reported gingival bleeding

    Aim: To describe associations of gingival bacterial composition and diversity with self-reported gingival bleeding and oral hygiene habits in a Norwegian regional-based population. Materials and Methods: We examined the microbiome composition of the gingival fluid (16S amplicon sequencing) in 484 adult participants (47% females; median age 28 years) in the Respiratory Health in Northern Europe, Spain and Australia (RHINESSA) study in Bergen, Norway. We explored bacterial diversity and abundance differences by the community periodontal index score, self-reported frequency of gingival bleeding, and oral hygiene habits. Results: Gingival bacterial diversity increased with increasing frequency of self-reported gingival bleeding, with higher Shannon diversity index for “always” β = 0.51 and “often” β = 0.75 (p < .001) compared to “never” gingival bleeding. Frequent gingival bleeding was associated with higher abundance of several bacteria such as Porphyromonas endodontalis, Treponema denticola, and Fretibacterium spp., but lower abundance of bacteria within the gram-positive phyla Firmicutes and Actinobacteria. Flossing and rinsing with mouthwash twice daily were associated with higher total abundance of bacteria in the Proteobacteria phylum but with lower bacterial diversity compared to those who never flossed or never used mouthwash. Conclusions: A high frequency of self-reported gingival bleeding was associated with higher bacterial diversity than found in participants reporting no gingival bleeding and with higher total abundance of known periodontal pathogens such as Porphyromonas spp., Treponema spp., and Bacteroides spp.
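
The Shannon diversity index reported in this abstract can be illustrated with a minimal sketch. The function name `shannon_index`, the natural-log base, and the toy count vectors are assumptions made for illustration; the abstract does not state how the study computed the index.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over taxa with nonzero counts.

    `counts` is a hypothetical list of per-taxon read counts for one sample.
    Natural log is an assumption; some tools use log base 2 instead.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    # Taxa with zero reads contribute nothing to the sum (p * ln p -> 0).
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


# Toy samples (invented numbers, not study data).
even_sample = [25, 25, 25, 25]    # four equally abundant taxa
skewed_sample = [97, 1, 1, 1]     # one dominant taxon

print(shannon_index(even_sample))
print(shannon_index(skewed_sample))
```

An evenly distributed community scores higher than one dominated by a single taxon, which is the sense in which the abstract speaks of "higher bacterial diversity".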

    Overview of data preprocessing for machine learning applications in human microbiome research

    Although metagenomic sequencing is now the preferred technique to study microbiome-host interactions, analyzing and interpreting microbiome sequencing data presents challenges primarily attributed to the statistical specificities of the data (e.g., sparse, over-dispersed, compositional, inter-variable dependency). This mini review explores preprocessing and transformation methods applied in recent human microbiome studies to address microbiome data analysis challenges. Our results indicate a limited adoption of transformation methods targeting the statistical characteristics of microbiome sequencing data. Instead, relative-abundance and normalization-based transformations, which do not account for the specific attributes of microbiome data, are prevalent. The information on preprocessing and transformations applied to the data before analysis was incomplete or missing in many publications, leading to reproducibility concerns, comparability issues, and questionable results. We hope this mini review will provide researchers and newcomers to the field of human microbiome research with an up-to-date point of reference for various data transformation tools and assist them in choosing the most suitable transformation method based on their research questions, objectives, and data characteristics.
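
As one illustration of a transformation that targets the compositional nature of such data, the sketch below applies a centered log-ratio (CLR) transform. The abstract does not name CLR; it is chosen here as a commonly used example, and the function name `clr` and the pseudocount of 0.5 for handling zeros are assumptions.

```python
import math


def clr(counts, pseudocount=0.5):
    """Centered log-ratio transform of one sample's taxon counts.

    Each value becomes ln(x_i) minus the mean of ln(x) over all taxa,
    i.e. the log of x_i divided by the sample's geometric mean.
    The pseudocount (an assumed value) is added so zero counts have a log.
    """
    if any(c < 0 for c in counts):
        raise ValueError("counts must be non-negative")
    logs = [math.log(c + pseudocount) for c in counts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


# Toy sample with one unobserved taxon (invented numbers, not study data).
sample = [10, 0, 5, 85]
print(clr(sample))
```

CLR values sum to zero within a sample and, apart from the pseudocount, do not depend on sequencing depth, which is why log-ratio methods are often discussed as alternatives to plain relative abundances.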

    Applications of Machine Learning in Human Microbiome Studies: A Review on Feature Selection, Biomarker Identification, Disease Prediction and Treatment

    The number of microbiome-related studies has notably increased the availability of data on human microbiome composition and function. These studies provide the essential material to deeply explore host-microbiome associations and their relation to the development and progression of various complex diseases. Improved data-analytical tools are needed to exploit all information from these biological datasets, taking into account the peculiarities of microbiome data, i.e., compositional, heterogeneous and sparse nature of these datasets. The possibility of predicting host-phenotypes based on taxonomy-informed feature selection to establish an association between microbiome and predict disease states is beneficial for personalized medicine. In this regard, machine learning (ML) provides new insights into the development of models that can be used to predict outputs, such as classification and prediction in microbiology, infer host phenotypes to predict diseases and use microbial communities to stratify patients by their characterization of state-specific microbial signatures. Here we review the state-of-the-art ML methods and respective software applied in human microbiome studies, performed as part of the COST Action ML4Microbiome activities. This scoping review focuses on the application of ML in microbiome studies related to association and clinical use for diagnostics, prognostics, and therapeutics. Although the data presented here is more related to the bacterial community, many algorithms could be applied in general, regardless of the feature type. This literature and software review covering this broad topic is aligned with the scoping review methodology. 
The manual identification of data sources has been complemented with: (1) automated publication search through digital libraries of the three major publishers using natural language processing (NLP) Toolkit, and (2) an automated identification of relevant software repositories on GitHub and ranking of the related research papers relying on a learning-to-rank approach.

    Indoor Airborne Microbiome and Endotoxin: Meteorological Events and Occupant Characteristics Are Important Determinants

    Airborne bacteria and endotoxin may affect asthma and allergies. However, there is limited understanding of the environmental determinants that influence them. This study investigated the airborne microbiomes in the homes of 1038 participants from five cities in Northern Europe: Aarhus, Bergen, Reykjavik, Tartu, and Uppsala. Airborne dust particles were sampled with electrostatic dust fall collectors (EDCs) from the participants' bedrooms. The dust washed from the EDCs' cloths was used to extract DNA and endotoxin. The DNA extracts were used for quantitative polymerase chain reaction (qPCR) measurement and 16S rRNA gene sequencing, while endotoxin was measured using the kinetic chromogenic limulus amoebocyte lysate (LAL) assay. The results showed that households in Tartu and Aarhus had a higher bacterial load and diversity than those in Bergen and Reykjavik, possibly due to elevated concentrations of outdoor bacterial taxa associated with low precipitation and high wind speeds. The Bergen-Tartu pair showed the largest difference in β diversity (ANOSIM R = 0.203). Multivariate regression models showed that α diversity indices and bacterial and endotoxin loads were positively associated with the occupants' age, number of occupants, cleaning frequency, presence of dogs, and age of the house. Further studies are needed to understand how meteorological factors influence the indoor bacterial community in light of climate change.

    Advancing microbiome research with machine learning: key findings from the ML4Microbiome COST action

    The rapid development of machine learning (ML) techniques has opened up the data-dense field of microbiome research for novel therapeutic, diagnostic, and prognostic applications targeting a wide range of disorders, which could substantially improve healthcare practices in the era of precision medicine. However, several challenges must be addressed to fully exploit the benefits of ML in this field. In particular, there is a need to establish "gold standard" protocols for conducting ML analysis experiments and improve interactions between microbiome researchers and ML experts. The Machine Learning Techniques in Human Microbiome Studies (ML4Microbiome) COST Action CA18131 is a European network established in 2019 to promote collaboration between discovery-oriented microbiome researchers and data-driven ML experts to optimize and standardize ML approaches for microbiome analysis. This perspective paper presents the key achievements of ML4Microbiome, which include identifying predictive and discriminatory 'omics' features, improving repeatability and comparability, developing automation procedures, and defining priority areas for the novel development of ML methods targeting the microbiome. The insights gained from ML4Microbiome will help to maximize the potential of ML in microbiome research and pave the way for new and improved healthcare practices.

    Machine learning approaches in microbiome research: challenges and best practices

    Microbiome data predictive analysis within a machine learning (ML) workflow presents numerous domain-specific challenges involving preprocessing, feature selection, predictive modeling, performance estimation, model interpretation, and the extraction of biological information from the results. To assist decision-making, we offer a set of recommendations on algorithm selection, pipeline creation and evaluation, stemming from the COST Action ML4Microbiome. We compared the suggested approaches on a multi-cohort shotgun metagenomics dataset of colorectal cancer patients, focusing on their performance in disease diagnosis and biomarker discovery. It is demonstrated that the use of compositional transformations and filtering methods as part of data preprocessing does not always improve the predictive performance of a model. In contrast, multivariate feature selection, such as the Statistically Equivalent Signatures algorithm, was effective in reducing the classification error. When validated on a separate test dataset, this algorithm, in combination with random forest modeling, provided the most accurate performance estimates. Lastly, we showed how linear modeling by logistic regression coupled with visualization techniques such as Individual Conditional Expectation (ICE) plots can yield interpretable results and offer biological insights. These findings are significant for clinicians and non-experts alike in translational applications.

    Statistical and Machine Learning Techniques in Human Microbiome Studies: Contemporary Challenges and Solutions

    The human microbiome has emerged as a central research topic in human biology and biomedicine. Current microbiome studies generate high-throughput omics data across different body sites, populations, and life stages. Many of the challenges in microbiome research are similar to those in other high-throughput studies: the quantitative analyses need to address the heterogeneity of data, specific statistical properties, and the remarkable variation in microbiome composition across individuals and body sites. This has led to a broad spectrum of statistical and machine learning challenges that range from study design, data processing, and standardization to analysis, modeling, cross-study comparison, prediction, data science ecosystems, and reproducible reporting. Nevertheless, although many statistics and machine learning approaches and tools have been developed, new techniques are needed to deal with emerging applications and the vast heterogeneity of microbiome data. We review and discuss emerging applications of statistical and machine learning techniques in human microbiome studies and introduce the COST Action CA18131 "ML4Microbiome" that brings together microbiome researchers and machine learning experts to address current challenges such as standardization of analysis pipelines for reproducibility of data analysis results, benchmarking, improvement, or development of existing and new tools and ontologies.

    Burden of disease attributable to risk factors in European countries: a scoping literature review

    Objectives: Within the framework of the burden of disease (BoD) approach, disease and injury burden estimates attributable to risk factors are a useful guide for policy formulation and priority setting in disease prevention. Considering the important differences in methods, and their impact on burden estimates, we conducted a scoping literature review to: (1) map the BoD assessments including risk factors performed across Europe, and (2) identify the methodological choices in comparative risk assessment (CRA) and risk assessment methods. Methods: We searched multiple literature databases, including grey literature websites, and targeted public health agencies' websites. Results: A total of 113 studies were included in the synthesis and further divided into independent BoD assessments (54 studies) and studies linked to the Global Burden of Disease (59 papers). Our results showed that the methods used to perform CRA varied substantially across independent European BoD studies. While there were some methodological choices that were more common than others, we did not observe patterns in terms of country, year, or risk factor. Each methodological choice can affect the comparability of estimates between and within countries and/or risk factors since they might significantly influence the quantification of the attributable burden. From our analysis, we observed that the use of CRA was less common for some types of risk factors and outcomes. These included environmental and occupational risk factors, which are more likely to use bottom-up approaches for health outcomes where disease envelopes may not be available. Conclusions: Our review also highlighted misreporting, the lack of uncertainty analysis, and the under-investigation of causal relationships in BoD studies. Development and use of guidelines for performing and reporting BoD studies will help understand differences and avoid misinterpretations, thus improving comparability among estimates.